Abstract

Given integers $n$, $m = \lfloor \beta n \rfloor$ and a probability measure $Q$ on $\{0, 1, \dots, m\}$, consider the
random intersection graph on the vertex set $[n] = \{1, 2, \dots, n\}$, where $i, j \in [n]$ are declared
adjacent whenever $S(i) \cap S(j) \ne \emptyset$. Here $S(1), \dots, S(n)$ denote iid random subsets of $[m]$
with the distribution $P(S(i) = A) = \binom{m}{|A|}^{-1} Q(|A|)$, $A \subset [m]$. For sparse random intersection
graphs we establish a first order asymptotic as $n \to \infty$ for the order of the largest connected
component $N_1 = n(1 - Q(0))\rho + o_P(n)$. Here $\rho$ is an average of nonextinction probabilities
of a related multi-type Poisson branching process.

1 Introduction 

Let $Q$ be a probability measure on $\{0, 1, \dots, m\}$, and let $S_1, \dots, S_n$ be random subsets of a
set $W = \{w_1, \dots, w_m\}$ drawn independently from the probability distribution $P(S_i = A) =
\binom{m}{|A|}^{-1} Q(|A|)$, $A \subset W$, for $i = 1, \dots, n$. A random intersection graph $G(n, m, Q)$ with a vertex
set $V = \{v_1, \dots, v_n\}$ is defined as follows. Every vertex $v_i$ is prescribed the set $S(v_i) = S_i$ and
two vertices $v_i$ and $v_j$ are declared adjacent (denoted $v_i \sim v_j$) whenever $S(v_i) \cap S(v_j) \ne \emptyset$. The
elements of $W$ are sometimes called attributes, and $S(v_i)$ is called the set of attributes of $v_i$.
Random intersection graphs $G(n, m, Q)$ with the binomial distribution $Q \sim Bi(m, p)$ were
introduced in Singer-Cohen [15] and Karoński et al. [13], see also [10] and [16]. The emergence
of a giant connected component in a sparse binomial random intersection graph was studied by
Behrisch [2], for $m = \lfloor n^{\alpha} \rfloor$, $\alpha \ne 1$, and by Lagerås and Lindholm [14], for $m = \lfloor \beta n \rfloor$, where $\beta > 0$
is a constant. They have shown, in particular, that, for $\alpha \ge 1$, the largest connected component
collects a fraction of all vertices whenever the average vertex degree, say $d$, is larger than $1 + \varepsilon$.
For $d < 1 - \varepsilon$ the order of the largest connected component is $O(\log n)$.

The graph $G(n, m, Q)$ defined by an arbitrary probability measure $Q$ (we call such graphs
inhomogeneous) was first considered in Godehardt and Jaworski [11], see also [12]. Deijfen and
Kets [8], and Bloznelis [3] showed (in increasing generality) that the typical vertex degree of
$G(n, m, Q)$ has a power law for a heavy tailed distribution $Q$. Another result by Deijfen and
Kets [8] says that, for $m \approx \beta n$, graphs $G(n, m, Q)$ possess the clustering property.
The emergence of a giant connected component in a sparse inhomogeneous intersection graph
with $n = o(m)$ (graph without clustering) was studied in [1]. The present paper addresses
inhomogeneous intersection graphs with clustering, i.e., the case where $m \approx \beta n$.



2 Results 



Given $\beta > 0$, let $\{G(n, m_n, Q_n)\}$ be a sequence of random intersection graphs such that

$\lim_{n} m_n n^{-1} = \beta.$  (1)

We shall assume that the sequence of probability distributions $\{Q_n\}$ converges to some
probability distribution $Q$ defined on $\{0, 1, 2, \dots\}$,

$\lim_{n} Q_n(t) = Q(t), \quad \forall t = 0, 1, \dots,$  (2)

and, in addition, the sequence of the first moments converges,

$\lim_{n} \sum_{t \ge 1} t Q_n(t) = \sum_{t \ge 1} t Q(t) < \infty.$  (3)

2.1. Degree distribution. Let $V = \{v_1, \dots, v_n\}$ denote the vertex set of $G_n = G(n, m_n, Q_n)$
and let $d_n(v_i)$ denote the degree of vertex $v_i$. Note that, by symmetry, the random variables
$d_n(v_1), \dots, d_n(v_n)$ have the same probability distribution, denoted $D_n$. In the following
proposition we recall a known fact about the asymptotic distribution of $D_n$.

Proposition 1. Assume that (1), (2) and (3) hold. Then we have as $n \to \infty$

$P(D_n = k) \to \sum_{t \ge 0} \frac{(at)^k}{k!} e^{-at} Q(t), \quad k = 0, 1, \dots.$  (4)

Here $a = \beta^{-1} \sum_{t \ge 0} t Q(t)$.

Roughly speaking, the limiting distribution of $D_n$ is the Poisson distribution $\mathcal{P}(\lambda)$ with random
parameter $\lambda = aX$, where $X$ is a random variable with the distribution $Q$. In particular, for
a heavy tailed distribution $Q$ we obtain the heavy tailed asymptotic distribution for $D_n$. For
$Q \sim Bi(m, p)$, (4) is shown in [16]. For arbitrary $Q$, (4) is shown (in increasing generality) in
[8] and [5].
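The mixed Poisson description of the limiting degree distribution can be illustrated numerically. The following is a minimal sketch, assuming $Q$ has finite support and is passed as a dictionary; the function name `mixed_poisson_pmf` and the input format are illustrative choices, not part of the paper.

```python
import math

def mixed_poisson_pmf(k, q, beta):
    """Probability that a Poisson(a*X) variable equals k, where X has law q.

    q    : dict mapping attribute-set size t to its probability Q(t)
           (hypothetical input format, finite support assumed)
    beta : the limit of m_n / n
    """
    # a = beta^{-1} * sum_t t * Q(t), as in Proposition 1.
    a = sum(t * qt for t, qt in q.items()) / beta
    total = 0.0
    for t, qt in q.items():
        lam = a * t  # Poisson parameter for a vertex with t attributes
        total += qt * math.exp(-lam) * lam ** k / math.factorial(k)
    return total
```

For instance, with half of the mass of $Q$ at $t = 0$ the limiting law has an atom of size at least $1/2$ at $k = 0$, reflecting isolated vertices with empty attribute sets.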

2.2. The largest component. Let $N_1(G)$ denote the order of the largest connected component
of a graph $G$ (i.e., $N_1(G)$ is the number of vertices of a connected component which has the
largest number of vertices). We are interested in a first order asymptotic of $N_1(G(n, m_n, Q_n))$
as $n \to \infty$.

The most commonly used approach to the parameter $N_1(G)$ of a random graph $G$ is based on
tree counting, see [U], [7]. For inhomogeneous random graphs it is convenient to count trees
with the help of branching processes, see [6]. Here large trees correspond to surviving branching
processes and the order of the largest connected component is described by means of the survival
probabilities of a related branching process.

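In the present paper we use the approach developed in [6]. Before formulating our main result
Theorem 1 we will introduce some notation. Let $\mathcal{X} = \mathcal{X}_{Q,\beta}$ denote the multi-type Galton-Watson
branching process, where particles are of types $t \in T = \{1, 2, \dots\}$ and where the number of
children of type $t$ of a particle of type $s$ has the Poisson distribution with mean $(s - 1) t q_t \beta^{-1}$.
Here we write $q_t = Q(t)$, $t \in T$. Let $\mathcal{X}(t)$ denote the process $\mathcal{X}$ starting at a particle of type $t$,
and $|\mathcal{X}(t)|$ denote the total progeny of $\mathcal{X}(t)$. Let $\rho_{Q,\beta}(t) = P(|\mathcal{X}(t)| = \infty)$ denote the survival
probability of the process $\mathcal{X}(t)$. Write $\rho^{(k)}_{Q,\beta}(t) = P(|\mathcal{X}(t)| \ge k)$,

$\rho_{Q,\beta} = \sum_{t \in T} q_t \rho_{Q,\beta}(t + 1), \qquad \rho^{(k)}_{Q,\beta} = \sum_{t \in T} q_t \rho^{(k)}_{Q,\beta}(t + 1).$

Note that for every $t \in T$ we have $\rho^{(k)}_{Q,\beta}(t) \downarrow \rho_{Q,\beta}(t)$ as $k \uparrow \infty$ (by the continuity property of
probabilities). Hence, $\rho^{(k)}_{Q,\beta} \downarrow \rho_{Q,\beta}$ as $k \uparrow \infty$.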
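The survival probabilities of the multi-type Poisson process can be approximated by a fixed point iteration. The sketch below assumes a finite type space and uses the standard fact that the extinction probabilities of a Poisson branching process satisfy $1 - \rho(s) = \exp\{-\sum_t \lambda_{st} \rho(t)\}$ with $\lambda_{st} = (s-1) t q_t \beta^{-1}$, so that $\rho(s) = 1 - e^{-(s-1)x}$ for a single scalar $x$. The function name and input format are hypothetical.

```python
import math

def survival_probabilities(q, beta, tol=1e-12, max_iter=100_000):
    """Approximate x = beta^{-1} * sum_t t*q_t*rho(t) and return (x, rho).

    q    : dict {t: q_t} on a finite set of types t >= 1 (assumed input format)
    beta : positive constant from the kernel (s, t) -> (s - 1) * t / beta
    rho  : function s -> 1 - exp(-(s - 1) * x), the survival probability
           of the process started from a particle of type s
    """
    # Start from rho == 1; the iteration decreases monotonically to the
    # largest fixed point, which corresponds to the survival probability.
    x = sum(t * qt for t, qt in q.items()) / beta
    for _ in range(max_iter):
        new_x = sum(t * qt * (1.0 - math.exp(-(t - 1) * x))
                    for t, qt in q.items()) / beta
        converged = abs(new_x - x) < tol
        x = new_x
        if converged:
            break
    return x, (lambda s: 1.0 - math.exp(-(s - 1) * x))
```

Near criticality the iteration converges slowly, so the tolerance and iteration cap above are pragmatic choices rather than guarantees.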

Theorem 1. Let $\beta > 0$. Let $\{m_n\}$ be a sequence of integers satisfying (1). Let $Q, Q_1, Q_2, \dots$ be
probability measures defined on $\{0, 1, 2, \dots\}$ such that $\sum_{i=0}^{m_n} Q_n(i) = 1$ for $n = 1, 2, \dots$. Assume
that (2) and (3) hold. Then we have as $n \to \infty$

$N_1(G(n, m_n, Q_n)) = n(1 - Q(0))\rho + o_P(n).$  (5)

Here $\rho = \rho_{Q^*,\beta^*}$, for $Q(0) < 1$, and $\rho = 0$, otherwise. $Q^*$ denotes the probability measure on
$\{1, 2, \dots\}$ defined by $Q^*(t) = (1 - Q(0))^{-1} Q(t)$, $t \ge 1$, and $\beta^* = \beta (1 - Q(0))^{-1}$.

Notation $o_P(n)$. We write $\eta_n = o_P(1)$ for a sequence of random variables $\{\eta_n\}$ that converges
to 0 in probability. We write $\eta_n = o_P(n)$ in the case where $\eta_n n^{-1} = o_P(1)$.
Remark 1. The correspondence $\rho > 0 \Leftrightarrow d > c > 1$ (with $d$ the average vertex degree) established for binomial random intersection
graphs in [2], [14] cannot be extended to general inhomogeneous graphs $G(n, m_n, Q_n)$. To
see this, consider the graph obtained from a binomial random intersection graph by replacing
$S(v_i)$ by $\emptyset$ for a randomly chosen fraction of vertices. This way we can make the expected
degree arbitrarily small, and still have the giant connected component spanned by a fraction of
unchanged vertices.

Remark 2. The kernel $(s, t) \to (s - 1) t \beta^{-1}$ of the Poisson branching process which determines
the fraction $\rho$ in the case $m_n \approx \beta n$ differs from the kernel $(s, t) \to st$ which appears in the case
$n = o(m_n)$, see [1].

3 Proof 

The section is organized as follows. Firstly we collect some notation and formulate auxiliary
results. We then prove Theorem 1. The proofs of auxiliary results are given at the end of the
section.

Let $W'$ be a finite set of size $|W'| = k$. Let $B, H$ be subsets of $W'$ of sizes $|B| = b$ and $|H| = h$
such that $B \cap H = \emptyset$. Let $A$ be a random subset of $W'$ uniformly distributed in the class of
subsets of $W'$ of size $a$. Introduce the probabilities

$p(a, b, k) = P(A \cap B \ne \emptyset),$
$p_1(a, b, k) = P(|A \cap B| = 1), \qquad p_2(a, b, k) = P(|A \cap B| \ge 2),$
$p(a, b, h, k) = P(|A \cap B| = 1, \ A \cap H = \emptyset),$
$p_1(a, b, h, k) = P(|A \cap B| = 1, \ A \cap H \ne \emptyset).$

Lemma 1. Let $k \ge 4$. Denote $\varkappa = ab/k$ and $\varkappa' = ab/(k - a)$. For $a + b \le k$ we have

$\varkappa (1 - \varkappa') \le p_1(a, b, k) \le p(a, b, k) \le \varkappa,$  (6)

$p_2(a, b, k) \le 2^{-1} \varkappa^2.$  (7)

Denote $\varkappa'' = (a - 1) h / (k - b)$. For $a + b + h \le k$ we have

$\varkappa (1 - \varkappa' - \varkappa'') \le p(a, b, h, k) \le \varkappa,$  (8)

$p_1(a, b, h, k) \le \varkappa \varkappa''.$  (9)
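The bounds of Lemma 1 can be checked exactly for small parameters, since all four probabilities are hypergeometric. The following sketch assumes the reading $\varkappa = ab/k$, $\varkappa' = ab/(k-a)$, $\varkappa'' = (a-1)h/(k-b)$; the function names are illustrative only, and exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from math import comb

def p(a, b, k):
    """P(A meets B) for a uniform a-subset A of a k-set and a fixed b-subset B."""
    return 1 - Fraction(comb(k - b, a), comb(k, a))

def p1(a, b, k):
    """P(|A ∩ B| = 1): choose the common element, then a-1 elements outside B."""
    return Fraction(b * comb(k - b, a - 1), comb(k, a))

def p2(a, b, k):
    """P(|A ∩ B| >= 2)."""
    return p(a, b, k) - p1(a, b, k)

def p_bh(a, b, h, k):
    """P(|A ∩ B| = 1 and A misses H), with B and H disjoint."""
    return Fraction(b * comb(k - b - h, a - 1), comb(k, a))

def p1_bh(a, b, h, k):
    """P(|A ∩ B| = 1 and A meets H)."""
    return p1(a, b, k) - p_bh(a, b, h, k)

def lemma1_holds(a, b, h, k):
    """Check the four bounds for one parameter choice with a + b + h <= k."""
    kap = Fraction(a * b, k)
    kap1 = Fraction(a * b, k - a)
    kap2 = Fraction((a - 1) * h, k - b)
    return (kap * (1 - kap1) <= p1(a, b, k) <= p(a, b, k) <= kap
            and p2(a, b, k) <= kap ** 2 / 2
            and kap * (1 - kap1 - kap2) <= p_bh(a, b, h, k) <= kap
            and p1_bh(a, b, h, k) <= kap * kap2)
```

The lower bounds are informative only when $\varkappa'$ and $\varkappa''$ are small, which is the regime $a, b, h = O(\omega(n))$ and $k \approx \beta n$ used later in the proofs.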
Given integers $n$, $m$ and a vector $s = (s_1, \dots, s_n)$ with coordinates from the set $\{0, 1, \dots, m\}$,
let $S(v_1), \dots, S(v_n)$ be independent random subsets of $W_m = \{w_1, \dots, w_m\}$ such that, for every
$1 \le i \le n$, the subset $S(v_i)$ is uniformly distributed in the class of all subsets of $W_m$ of size $s_i$.
Let $G_s(n, m)$ denote the random intersection graph on the vertex set $V_n = \{v_1, \dots, v_n\}$ defined
by the random sets $S(v_1), \dots, S(v_n)$. That is, we have $v_i \sim v_j$ whenever $S(v_i) \cap S(v_j) \ne \emptyset$.
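A graph with prescribed attribute-set sizes is easy to simulate, which gives a quick empirical view of the order of the largest component. The sketch below is only an illustration under stated assumptions: the function name, the use of Python's `random` module, and the union-find bookkeeping are choices made here, not part of the paper. Two vertices are merged as soon as they share an attribute, so the adjacency relation is never materialised.

```python
import random
from collections import defaultdict

def largest_component_order(sizes, m, rng=None):
    """Sample the graph with |S(v_i)| = sizes[i] and return N_1.

    sizes : list of attribute-set sizes s_1, ..., s_n (each at most m)
    m     : number of attributes
    rng   : optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    n = len(sizes)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_holder = {}  # attribute -> first vertex that received it
    for v, s in enumerate(sizes):
        for w in rng.sample(range(m), s):  # uniform s-subset of attributes
            if w in first_holder:
                parent[find(v)] = find(first_holder[w])
            else:
                first_holder[w] = v

    counts = defaultdict(int)
    for v in range(n):
        counts[find(v)] += 1
    return max(counts.values()) if counts else 0
```

Running this for growing $n$ with $m \approx \beta n$ and fixed type frequencies gives values of $N_1 / n$ that can be compared with the survival probability of the branching process.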

Lemma 2. Let $M > 0$ be an integer and let $Q$ be a probability measure defined on
$[M] = \{1, \dots, M\}$. Let $\{m_n\}$ be a sequence of integers, and $\{s_n = (s_{n1}, \dots, s_{nn})\}$ be a
sequence of vectors with integer coordinates $s_{ni} \in [M]$, $1 \le i \le n$. Let $n_t$ denote the number of
coordinates of $s_n$ attaining the value $t$. Assume that, for some integer $n'$ and a sequence $\{\varepsilon_n\} \subset (0, 1)$
converging to zero, we have, for every $n > n'$,

$\max_{1 \le t \le M} |(n_t / n) - Q(t)| \le \varepsilon_n,$  (10)

$|m_n (\beta n)^{-1} - 1| \le \varepsilon_n.$  (11)

Then there exists a sequence $\{\varepsilon^*_n\}_{n \ge 1}$ converging to zero such that, for $n > n'$, we have

$P(|N_1(G_{s_n}(n, m_n)) - n \rho_{Q,\beta}| > \varepsilon^*_n n) < \varepsilon^*_n.$  (12)

Several technical steps of the proof of Lemma 2 are collected in the separate Lemma 3.

Lemma 3. Assume that conditions of Lemma 2 are satisfied. For any function $\omega(\cdot)$ satisfying
$\omega(n) \to +\infty$ as $n \to \infty$, bounds (24), (25), and (27) hold true.
Proof of Theorem 1. Write, for short, $G_n = G(n, m_n, Q_n)$ and $N_1 = N_1(G(n, m_n, Q_n))$. Given
$t = 0, 1, \dots$, let $n_t$ denote the number of vertices of $G_n$ with the attribute sets of size $t$. Write
$q_{nt} = Q_n(t)$ and $q_t = Q(t)$, and $q^*_t = Q^*(t)$.

Note that vertices with empty attribute sets are isolated in $G_n$. Hence, the connected components
of order at least 2 of $G_n$ belong to the subgraph $G_{[\infty],n} \subset G_n$ induced by the vertices with non-empty
attribute sets.

In the case where $q_0 = 1$, we obtain from (2) that the expected number of vertices in $G_{[\infty],n}$ is
$E(n - n_0) = n(1 - q_{n0}) = o(n)$. This identity implies $N_1 = o_P(n)$. We obtain (5), for $q_0 = 1$.
Let us prove (5) for $q_0 < 1$. Let $G_{[M],n}$ denote the subgraph of $G_n$ induced by the vertices with
attribute sets of sizes from the set $[M]$. In the proof we approximate $N_1(G_n)$ by $N_1(G_{[M],n})$ and
use the result for $N_1(G_{[M],n})$ shown in Lemma 2.

We need some notation related to $G_{[M],n}$. The inequality $q_0 < 1$ implies that, for large $M$,
the sum $q_{[M]} := q_1 + \dots + q_M \le 1 - q_0$ is positive. Given such $M$, let $Q_M$ be the probability
measure on $[M]$, which assigns the mass $q_{Mt} = q_t / q_{[M]}$ to $t \in [M]$. Denote $\rho_{[M]} = \rho_{Q_M,\beta_M}$,
where $\beta_M = \beta / q_{[M]}$. Clearly, $\beta_M$ converges to $\beta^*$ as $M \to \infty$, and we have

$\forall t \ge 1 \quad \lim_{M} q_{Mt} = q^*_t \quad \text{and} \quad \lim_{M} \sum_{t \ge 1} t q_{Mt} = \sum_{t \ge 1} t q^*_t < \infty.$  (13)

It follows from (13) that

$\lim_{M} \rho_{[M]} = \rho_{Q^*,\beta^*}.$  (14)

For the proof of (14) we refer to Chapter 6 of [6].

We are now ready to prove (5). For this purpose we combine the upper and lower bounds
$N_1 \ge n(1 - q_0) \rho_{Q^*,\beta^*} - o_P(n)$ and $N_1 \le n(1 - q_0) \rho_{Q^*,\beta^*} + o_P(n)$.
We give the proof of the lower bound only. The proof of the upper bound is almost the same as
that of a corresponding bound in [4], see formula (56) in [4].
In the proof we show that, for every $\varepsilon \in (0, 1)$,

$P(N_1 \ge n(1 - q_0) \rho_{Q^*,\beta^*} - 2\varepsilon n) = 1 - o(1) \quad \text{as } n \to \infty.$  (15)

Fix $\varepsilon \in (0, 1)$. In view of (14) we can choose $M$ such that

$\rho_{Q^*,\beta^*} - \varepsilon < \rho_{[M]} < \rho_{Q^*,\beta^*} + \varepsilon.$  (16)

We apply Lemma 2 to $G_{[M],n}$ conditionally given the event

$A_n = \{\max_{1 \le t \le M} |n_t - q_t n| \le n \delta_n + n^{2/3}\}.$

Here $\delta_n = \max_{1 \le t \le M} |q_{nt} - q_t|$ satisfies $\delta_n = o(1)$, see (2). In addition, we have

$1 - P(A_n) \le P(\max_{1 \le t \le M} |n_t - q_{nt} n| > n^{2/3}) \le \sum_{1 \le t \le M} P(|n_t - q_{nt} n| > n^{2/3}) \le M n^{-1/3} = o(1).$

In the last step we have invoked the bounds $P(|n_t - q_{nt} n| > n^{2/3}) \le n^{-1/3}$, which follow by
Chebyshev's inequality applied to binomial random variables $n_t$, $t \in [M]$. Now, combining the
bound, which follows from Lemma 2,

$P(|N_1(G_{[M],n}) - n \rho_{[M]}| > n\varepsilon \mid A_n) = o(1)$  (17)

with (16) and the bound $P(A_n) = 1 - o(1)$, we obtain

$P(|N_1(G_{[M],n}) - n \rho_{Q^*,\beta^*}| > 2n\varepsilon) = o(1).$

Finally, (15) follows from the obvious inequality $N_1 \ge N_1(G_{[M],n})$.

□ 

Proof of Lemma 2. The proof consists of two steps. Firstly, we show that components of order
at least $n^{2/3}$ contain $n \rho_{Q,\beta} + o_P(n)$ vertices in total. This implies the upper bound for
$N_1 = N_1(G_{s_n}(n, m_n))$

$N_1 \le n \rho_{Q,\beta} + o_P(n).$  (18)

Secondly, we prove that with probability tending to one such vertices belong to a common
connected component. This implies the lower bound

$N_1 \ge n \rho_{Q,\beta} - o_P(n).$  (19)

Clearly, (18), (19) yield (12). Before the proof of (18), (19), we introduce some notation.
Notation. Denote $\rho = \rho_{Q,\beta}$ and write $q_t = Q(t)$, $t \in [M]$. In what follows, we drop the subscript
$n$ and write $m = m_n$, $V = V_n$, $W = W_m$, $G = G_{s_n}(n, m)$. We say that a vertex $v \in V$ is of type
$t$ if the size $s_v = |S(v)|$ of its attribute set $S(v)$ is $t$. An edge $u' \sim u''$ of $G$ is called regular if
$|S(u') \cap S(u'')| = 1$. In this case $u'$ and $u''$ are called regular neighbours. The edge $u' \sim u''$ is
called irregular otherwise. We say that $v_i$ is smaller than $v_j$ whenever $i < j$. Given $v \in V$, let
$C_v$ denote the connected component of $G$ containing vertex $v$.
In order to count vertices of $C_v$ we explore this component using the Breadth-First Search
procedure.

Component exploration. Select $v \in V$. In the beginning all vertices are uncoloured. Colour $v$
white and add it to the list $L_v$ (now $L_v$ consists of a single white vertex $v$). Next we proceed
recursively. We choose the oldest white vertex in the list, say $u$, scan the current set of uncoloured
vertices (in increasing order) and look for neighbours of $u$. Each newly discovered neighbour
immediately receives white colour and is added to the list. In particular, neighbours with
smaller indices are added to the list before ones with larger indices. Once all the uncoloured
vertices are scanned colour $u$ black. Neighbours of $u$ discovered in this step are called children
of $u$. We say that $u' \in L_v$ is older than $u'' \in L_v$ if $u'$ has been added to the list before $u''$.
Exploration ends when there are no more white vertices in the list available.
By $L^*_v = \{v = u_1, u_2, u_3, \dots\}$ we denote the final state of the list after the exploration is
complete. Here $i < j$ means that $u_i$ has been discovered before $u_j$. Clearly, $L^*_v$ is the vertex set
of $C_v$. Denote $L_v(k) = \{u_i \in L^*_v : i \le k\}$. Note that $|L_v(k)| = \min\{k, |L^*_v|\}$. By $u_{j^*}$ we denote
the vertex which has discovered $u_j$ ($u_j$ is a child of $u_{j^*}$). Introduce the sets,

$D_k = \cup_{1 \le j \le k} S(u_j), \qquad S'(u_i) = S(u_i) \setminus D_{i-1}, \qquad k \ge 1, \ i \ge 2,$  (20)

and put $D_0 = \emptyset$, $S'(u_1) = S(u_1)$.
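The ordinary exploration described above can be sketched as follows. This is an illustrative implementation under assumptions: vertices are indexed $0, \dots, n-1$, attribute sets are Python sets, and the optional `limit` argument (a name introduced here) stops the search once the list has the given length, in the spirit of stopping after $\omega(n)$ vertices.

```python
from collections import deque

def explore(v, attr_sets, limit=None):
    """Breadth-first exploration of the component of v.

    attr_sets : list of sets, attr_sets[i] = S(v_i)
    limit     : optional cap on the length of the list (hypothetical parameter)
    Returns (order, parent): the list of discovered vertices in order of
    discovery, and a dict mapping each vertex to the vertex that discovered it.
    """
    uncoloured = set(range(len(attr_sets))) - {v}
    order = [v]
    parent = {v: None}
    white = deque([v])
    while white:
        u = white.popleft()               # oldest white vertex
        for x in sorted(uncoloured):      # scan in increasing order
            if attr_sets[u] & attr_sets[x]:
                uncoloured.discard(x)     # x becomes white
                order.append(x)
                parent[x] = u
                white.append(x)
                if limit is not None and len(order) >= limit:
                    return order, parent
        # all uncoloured vertices scanned: u is now black
    return order, parent
```

The regular and simple explorations differ only in the test applied to each scanned vertex: the intersection must have size exactly one, and for simple children the remaining attributes must avoid those already used by the list.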

Regular exploration is performed similarly to the 'ordinary' exploration, but now only regular
neighbours are added to the list. We call them regular children. A regular child $u'$ of $u$ is
called simple if $S(u') \setminus S(u)$ does not intersect with $S(u'')$ for any vertex $u''$ that has already been
included in the list before $u'$. Otherwise the regular child is called complex. Simple exploration
is performed similarly to the regular exploration, but now only simple children are added to the
list.

In the case of regular (respectively simple) exploration we use the notation $L^r_v$, $L^{r*}_v$, $L^r_v(k)$, $D^r_k$,
$S'^r(u_i)$ (respectively $L^s_v$, $L^{s*}_v$, $L^s_v(k)$, $D^s_k$, $S'^s(u_i)$) which is defined in much the same way as above.
Similarly, $i^*$ denotes the number in the list ($L^r_v$ or $L^s_v$ depending on the context) of the vertex
that has discovered $u_i$ ($u_i$ is a child of $u_{i^*}$). For a member $u_j$ of the list $L^{s*}_v = \{v = u_1, u_2, \dots\}$
we denote $H(u_j) = (\cup_{j^* < r < j} S(u_r)) \setminus D^s_{j^*}$. Consider the simple exploration at the moment where
the current oldest white vertex, say $u_i$ of the evolving list $L^s_v = \{v = u_1, u_2, \dots\}$, starts the search of
its simple children. Let $U_i = \{v_{j_1}, \dots, v_{j_r}, \dots, v_{j_k}\}$ denote the current set of uncoloured vertices
(the set of potential simple children). Here $j_1 < j_2 < \dots < j_k$. Firstly, allow $u_i$ to discover
its simple children among $\{v_{j_1}, \dots, v_{j_{r-1}}\}$. Define the set $H_i(v_{j_r}) = (\cup_{u \in L} S(u)) \setminus D^s_i$, where $L$
denotes the set of current white elements of the list that are younger than $u_i$. In particular,
$L$ includes the simple children of $u_i$ discovered among $v_{j_1}, \dots, v_{j_{r-1}}$. Observe that any $u' \in U_i$
becomes a simple child of $u_i$ whenever it is a regular neighbour of $u_i$ and $H_i(u') \cap S(u') = \emptyset$, i.e.,

$|S(u') \cap S(u_i)| = 1 \quad \text{and} \quad S(u') \cap H_i(u') = \emptyset.$  (21)

Observe that for any member of the list $u_j \in L^{s*}_v$ we have $H(u_j) = H_{j^*}(u_j)$.
Note that irregular neighbours discovered during regular exploration receive white colour, but
are not added to the list $L^r_v$. Similarly, irregular neighbours and complex children discovered
during simple exploration receive white colour, but are not added to the list $L^s_v$. Note also that
$L^{s*}_v$ does not need to be a subset of $L^{r*}_v$.

Let $\omega(n)$ be an integer function such that $\omega(n) \to +\infty$ and $\omega(n) = o(n)$ as $n \to \infty$. A
vertex $v \in V$ is called big (respectively, br-vertex and bs-vertex) if $|L^*_v| \ge \omega(n)$ (respectively,
$|L^{r*}_v| \ge \omega(n)$ and $|L^{s*}_v| \ge \omega(n)$). Let $B$, $B^r$, and $B^s$ denote the collections of big vertices,
br-vertices, and bs-vertices respectively. Clearly, we have $B^r, B^s \subset B$. Note that in order to decide
whether a vertex $v$ is big we do not need to explore the component completely. Indeed, we
may stop the exploration after the number of coloured vertices reaches $\omega(n)$. In what follows
we assume that the exploration was stopped after the number of coloured vertices had reached
$\omega(n)$ (in this case $v \in B$) or ended even earlier because the last white vertex of the list failed to
find an uncoloured neighbour (in this case $v \notin B$).
The upper bound. Fix $\omega(\cdot)$. We show that

$|B| - n\rho = o_P(n).$  (22)

Note that (22) combined with the simple inequality $N_1 \le \max\{\omega(n), |B|\}$ implies (18). We
obtain (22) from the bounds

$|B| - |B^s| = o_P(n),$  (23)
$|B^s| - n\rho = o_P(n).$  (24)

(24) is shown in Lemma 3. (23) follows from the bound $E(|B| - |B^s|) = o(n)$. In order to prove
this bound we show that

$E|B^s| - n\rho = o(n),$  (25)
$E|B| \le n\rho + o(n).$  (26)

(25) is shown in Lemma 3. (26) follows from the bounds

$E|B^r| \le n\rho + o(n),$  (27)
$E|B \setminus B^r| = o(n).$  (28)

(27) is shown in Lemma 3. In order to show (28) we write $E|B \setminus B^r| = \sum_{v \in V} P(v \in B \setminus B^r)$
and invoke the bounds, which hold uniformly in $v \in V$,

$P(v \in B \setminus B^r) = O(\omega(n) n^{-1}).$  (29)

In the proof of (29) we inspect the list $L_v(\omega(n))$ and look for an irregular child. The probability
that a given $u_i \in L_v(\omega(n))$ is an irregular child is $O(n^{-1})$, see (7). Now (29) follows from the fact
that $L_v(\omega(n))$ has at most $\omega(n) = o(n)$ elements. The proof of (23) is complete.

The lower bound. We start with a simple observation that whp each attribute $w$ is shared by
at most $O(\ln n)$ vertices. Denote $f(w) = \sum_{v \in V} \mathbb{I}_{\{w \in S(v)\}}$, $w \in W$. We show that the inequality

$\max_{w \in W} f(w) \le 2M \ln n$  (30)

holds with probability $1 - o(1)$. Since $f(w)$ is a sum of independent Bernoulli random variables
with success probabilities at most $M/m$, Chernoff's inequality implies $P(f(w) > 2M \ln n) \le
c_{M,\beta} n^{-2}$. Hence, the complementary event to (30) has probability

$P(\max_{w \in W} f(w) > 2M \ln n) \le \sum_{w \in W} P(f(w) > 2M \ln n) = o(1).$

Let us prove (19). Fix $\varepsilon \in (0, 1)$. For each $t \in [M]$ choose $\lfloor n_t \varepsilon \rfloor$ vertices of type $t$ and colour
them red. Let $G'$ denote the subgraph of $G$ induced by uncoloured vertices, and let $C_1, C_2, \dots$
denote the (vertex sets of) connected components of $G'$ of order at least $n^{2/3}$. Observe that
the number, say $k$, of such components is at most $(1 - \varepsilon) n^{1/3}$. We apply (22) to the intersection
graph $G'$ and function $\omega(n) = \lfloor n^{2/3} \rfloor$ and obtain $|\cup_{i \ge 1} C_i| = (1 - \varepsilon) n \rho_{Q,\beta'} + o_P(n)$, where
$\beta' = \beta (1 - \varepsilon)^{-1}$. We show below that with a high probability all vertices of $\cup_{i \ge 1} C_i$ belong to a
single connected component of the graph $G$. Hence, $N_1 \ge (1 - \varepsilon) n \rho_{Q,\beta'} + o_P(n)$. Letting $\varepsilon \to 0$
we then immediately obtain lower bound (19).

We assume that $G$ is obtained in two steps. Firstly, the uncoloured vertices generate $G'$, and,
secondly, the red vertices add the remaining part of $G$. Let us consider the second step where
the red vertices add their contribution. Write $I_{ij} = 1$ if $C_i$ and $C_j$ are not connected by a path
in $G$, and $I_{ij} = 0$ otherwise. Let $N = \sum_{1 \le i < j \le k} I_{ij}$ denote the number of disconnected pairs.
Clearly, the event $N = 0$ implies that all vertices from $\cup_{i \ge 1} C_i$ belong to the same connected
component of $G$. Therefore, it suffices to show that $P(N = 0) = 1 - o(1)$. For this purpose we
prove the bound $P(N \ge 1 \mid G') = o(1)$ uniformly in $G'$ satisfying (30), see (32) below.
In what follows we assume that (30) holds. Let $f(C_i) = \cup_{v \in C_i} S(v)$ denote the set of attributes
occupied by vertices from $C_i$. Here $f(C_i) \cap f(C_j) = \emptyset$, for $i \ne j$. Note that if a red vertex finds
neighbours in $C_i$ and $C_j$ simultaneously then it builds a path in $G$ that connects components $C_i$
and $C_j$. Clearly, only vertices with attribute sets of size at least 2 (i.e., vertices of types $2, 3, \dots$)
can build such a path. The probability of building such a path is minimized by vertices of type
2. This minimal probability is

$p_{ij} = \frac{2 |f(C_i)| \times |f(C_j)|}{m(m - 1)}.$

Note that (30) combined with the inequality $|C_i| \ge \lfloor n^{2/3} \rfloor$ implies $|f(C_i)| \ge n^{2/3} (2M \ln n)^{-1}$.
Hence,

$p_{ij} \ge \frac{1}{2M^2} \frac{n^{4/3}}{(m \ln n)^2} =: p_*.$

Let $r := \lfloor n_2 \varepsilon \rfloor + \dots + \lfloor n_M \varepsilon \rfloor$ denote the number of red vertices of types $2, 3, \dots$. Observe that,
for large $n$, (10) implies $r \approx \varepsilon q' n$. Here $q' = q_2 + \dots + q_M$. In particular, we have

$P(I_{ij} = 1 \mid G') \le (1 - p_{ij})^r \le (1 - p_*)^r \le e^{-p_* r}.$  (31)

Here $p_* r \ge c' n^{1/3} (\ln n)^{-2}$ and the constant $c'$ depends on $\beta$, $M$, and $q'$. Next, we apply Markov's
inequality to the conditional probability

$P(N \ge 1 \mid G') \le E(N \mid G') = \sum_{1 \le i < j \le k} P(I_{ij} = 1 \mid G').$

Invoking (31) and the inequality $k \le (1 - \varepsilon) n^{1/3}$ we obtain

$P(N \ge 1 \mid G') \le k^2 e^{-p_* r} \le n^{2/3} e^{-c' n^{1/3} \ln^{-2} n}.$  (32)

□
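The probability that a single vertex of type 2 links two components is a plain counting quantity: its two attributes form a uniform 2-subset of the $m$ attributes, and it links the components when one attribute falls in each of the two disjoint attribute sets. The sketch below verifies the value $2|F_i||F_j| / (m(m-1))$ by brute-force enumeration on a toy example; the function name and the toy sets are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def connect_probability(fi, fj, m):
    """Exact probability that a uniform 2-subset of range(m) meets both fi and fj.

    fi, fj : disjoint sets of attributes (subsets of range(m))
    """
    hits = sum(
        1
        for x, y in combinations(range(m), 2)
        if (x in fi and y in fj) or (x in fj and y in fi)
    )
    total = m * (m - 1) // 2  # number of 2-subsets of the attribute set
    return Fraction(hits, total)
```

With $r$ independent red vertices of type at least 2, the chance that none of them links a given pair of components is at most $(1 - p)^r \le e^{-pr}$, where $p$ is this probability; this is the mechanism behind the exponential bound above.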

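Proof of Lemma 3. Throughout the proof we use the notation of Lemma 2.
Fix $\omega(\cdot)$. Given $0 < \varepsilon < 1$, let $\mathcal{Y}^{+\varepsilon}$ and $\mathcal{Y}^{-\varepsilon}$ be multi-type Galton-Watson processes with
type space $[M]$ where the number of children $Y^{+\varepsilon}_{st}$ ($Y^{-\varepsilon}_{st}$) of type $t$ of a particle of type $s$ has
binomial distribution $Bi(\lfloor q_t n (1 + \varepsilon) \rfloor, p_{st}(1 + \varepsilon))$ and $Bi(\lfloor q_t n (1 - \varepsilon) \rfloor, p_{st}(1 - \varepsilon))$ respectively.
Here $p_{st} := (s - 1) t (\beta n)^{-1}$.

Let $\mathcal{X}^{+\varepsilon}$ (and $\mathcal{X}^{-\varepsilon}$) be the multi-type Galton-Watson process with type space $[M]$ where the number
of children $X^{+\varepsilon}_{st}$ (and $X^{-\varepsilon}_{st}$) of type $t$ of a particle of type $s$ has the Poisson distribution with
mean $\lambda_{st}(1 + \varepsilon)$ (and $\lambda_{st}(1 - \varepsilon)$). Here $\lambda_{st} := (s - 1) t q_t \beta^{-1}$.
Given a multi-type G-W process $\mathcal{Z}$ with type space $[M]$, by $\mathcal{Z}(t)$ we denote the process starting
at a particle of type $t$, $|\mathcal{Z}(t)|$ denotes the total progeny of $\mathcal{Z}(t)$, $\rho(\mathcal{Z}, t) := P(|\mathcal{Z}(t)| = \infty)$ and

$\rho^{(k)}(\mathcal{Z}, t) := P(|\mathcal{Z}(t)| \ge k).$

It is known, see, e.g. inequality (1.23) in [1], that the total variation distance between the
binomial distribution $Bi(r, p)$ and the Poisson distribution with the same mean is at most $p$.
Therefore, by a coupling of the offspring numbers of binomial and Poisson branching processes
we obtain

$\rho^{(\omega(n))}(\mathcal{Y}^{+\varepsilon}, t) = \rho^{(\omega(n))}(\mathcal{X}^{+\varepsilon'}, t) + O(\omega(n)/n),$  (33)

$\rho^{(\omega(n))}(\mathcal{Y}^{-\varepsilon}, t) = \rho^{(\omega(n))}(\mathcal{X}^{-\varepsilon''}, t) + O(\omega(n)/n).$  (34)

Here $\varepsilon' = (1 + \varepsilon)^2 - 1$ and $\varepsilon'' = 1 - (1 - \varepsilon)^2$. Letting $n \to \infty$ we obtain,

$\rho^{(\omega(n))}(\mathcal{X}^{+\varepsilon'}, t) \to \rho(\mathcal{X}^{+\varepsilon'}, t), \qquad \rho^{(\omega(n))}(\mathcal{X}^{-\varepsilon''}, t) \to \rho(\mathcal{X}^{-\varepsilon''}, t).$  (35)

Furthermore, letting $\varepsilon \downarrow 0$ we obtain

$\rho(\mathcal{X}^{+\varepsilon'}, t) \to \rho_{Q,\beta}(t), \qquad \rho(\mathcal{X}^{-\varepsilon''}, t) \to \rho_{Q,\beta}(t).$  (36)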
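The total variation bound between $Bi(r, p)$ and the Poisson law with mean $rp$, used in the coupling above, is easy to inspect numerically. The sketch below sums the pointwise differences of the two probability mass functions over a truncated range; the function name and the truncation parameter `extra` are illustrative assumptions, and the neglected Poisson tail makes the returned value a slight underestimate.

```python
import math

def tv_binomial_poisson(r, p, extra=60):
    """Approximate total variation distance between Bi(r, p) and Poisson(r * p).

    The sum runs over k = 0, ..., r + extra; the Poisson mass beyond this
    range is ignored (negligible when r * p is moderate).
    """
    lam = r * p
    pois = math.exp(-lam)  # Poisson pmf at k = 0, updated recursively
    tv = 0.0
    for k in range(r + extra + 1):
        binom = math.comb(r, k) * p ** k * (1 - p) ** (r - k) if k <= r else 0.0
        tv += abs(binom - pois)
        pois *= lam / (k + 1)
    return tv / 2
```

In the proof the success probabilities are of order $n^{-1}$, and at most $\omega(n)$ particles are inspected, which is where the error term of order $\omega(n)/n$ comes from.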

Proof of (27). We shall show that

$P(v \in B^r) \le \rho_{Q,\beta}(s_v + 1) + o(1)$  (37)

uniformly in $v$. Collecting these bounds in the identity $E|B^r| = \sum_{v \in V} P(v \in B^r)$ and using
(10) we then obtain (27). Therefore, it suffices to prove (37). In the proof we couple regular
exploration starting at $v$ with the process $\mathcal{Y}^{+\varepsilon}(s_v + 1)$. Let $Y_{it}$ denote the number of regular
children of type $t$ discovered by $u_i \in L^r_v = \{v = u_1, \dots\}$. Let $n_{it}$ denote the number of uncoloured
vertices of type $t$ at the moment, when $u_i$ starts exploration of its neighbourhood. Then $Y_{it}$ has
the binomial distribution $Bi(n_{it}, p'_{it})$ with success probability $p'_{it} = p_1(t, |S'^r(u_i)|, |W \setminus D^r_{i-1}|)$.
Note that for large $n$ we have

$n_{it} \le \lfloor q_t n (1 + \varepsilon) \rfloor, \qquad p'_{it} \le |S'^r(u_i)| \, t \, (\beta n)^{-1} (1 + \varepsilon).$  (38)

The first inequality follows from (10). The second inequality follows from (6) combined with the
inequalities

$m \ge |W \setminus D^r_{i-1}| = m - |D^r_{i-1}| \ge m - M\omega(n) = m - o(m).$  (39)

In addition, in view of (11), we can replace $m$ by $\beta n$ in (38). (38) shows that the parameters of
the binomial distribution of $Y_{it}$ are smaller than the corresponding parameters of the offspring
distribution of the branching process $\mathcal{Y}^{+\varepsilon}(s_v + 1)$. Therefore, particles of the branching process
produce at least as many children of each type as the vertices $u_i$, $i \le \omega(n)$. Note that $v = u_1$
corresponds to a particle of type $|S'^r(v)| + 1 = s_v + 1$ of the branching process while remaining
vertices $u_i$, $i \ge 2$ correspond to particles of types $s_{u_i} = |S(u_i)|$ respectively. Hence, we have

$P(v \in B^r) \le P(|\mathcal{Y}^{+\varepsilon}(s_v + 1)| \ge \omega(n)).$  (40)

(40) in combination with (33), (35) and (36) implies (37).

Proof of (25). Given $v \in V$, we start simple exploration at $v$. Let $K_t$ ($I_t$) denote the number of
complex (irregular) children of type $t$ discovered by the exploration until the list $L^s_v(\omega(n))$ was
completed. We put a label on $v$ whenever $\max_{t \in [M]} \max\{K_t, I_t\} \ge \omega(n)$.
Let $A$ denote the set of labeled vertices and $p'_v := P(v \in B^s \mid v \notin A)$ be the probability that the
simple exploration of unlabeled vertex $v$ discovers at least $\omega(n)$ vertices. We show below that

$P(v \in A) = O(n^{-1}),$  (41)
$p'_v = \rho_{Q,\beta}(s_v + 1) + o(1).$  (42)

It follows from (41), (42) that

$P(v \in B^s) = p'_v + O(n^{-1}) = \rho_{Q,\beta}(s_v + 1) + o(1).$  (43)

Invoking the latter identity in the expression $E|B^s| = \sum_{v \in V} P(v \in B^s)$ we obtain (25).
Proof of (42). Given $\varepsilon > 0$ we show that for large $n$

$P(|\mathcal{Y}^{+\varepsilon}(s_v + 1)| \ge \omega(n)) \ge p'_v \ge P(|\mathcal{Y}^{-\varepsilon}(s_v + 1)| \ge \omega(n)).$  (44)

These inequalities in combination with (33)-(36) imply (42).

In order to generate events of probability $p'_v$ we use rejection sampling. In the course of
exploration we keep track of the number of coloured vertices and interrupt the exploration at the
moment when this number exceeds $3\omega(n)$. Exploration is rejected if it is interrupted before the
list $L^s_v(\omega(n))$ is completed. Otherwise it is accepted. Clearly, $p'_v$ is the probability that the list
$L^s_v(\omega(n))$ of an accepted exploration has collected all $\omega(n)$ elements.

In the proof of (44) we couple the simple exploration process with branching processes $\mathcal{Y}^{-\varepsilon}(s_v + 1)$
and $\mathcal{Y}^{+\varepsilon}(s_v + 1)$ so that the number of simple children of type $t$ of the vertex $v$ is at least (most)
as large as the number of particles of type $t$ in the first generation of $\mathcal{Y}^{-\varepsilon}(s_v + 1)$ ($\mathcal{Y}^{+\varepsilon}(s_v + 1)$),
$t \in [M]$. In the further steps of exploration the number $Y_t(u)$ of simple children of type $t$
discovered by a particle $u \in L^s_v(\omega(n)) \setminus \{v\}$ is at least (most) as large as the number of children
of type $t$ produced by the corresponding particle of type $s_u$ of the process $\mathcal{Y}^{-\varepsilon}$ ($\mathcal{Y}^{+\varepsilon}$).
To make sure that such a coupling is possible we fix $u = u_i \in L^s_v(\omega(n))$ and count its simple
children. Recall that $u_i$ selects simple children from the current set of uncoloured vertices. These
are checked one after another in increasing order, and each newly discovered simple child is added
to the list before the next uncoloured vertex is checked. At the moment when a vertex $g$ is
checked, its probability to be a simple child of $u$ is $p_i(g) = p(|S(g)|, |S'^s(u)|, |H_i(g)|, |W \setminus D^s_{i-1}|)$.
It is a conditional probability given $\{S(u'), u' \in L_g\}$. Here $L_g$ is the set of vertices that have
been added to the list before $g$ was checked. Note that, as far as the probability of the event
$\{v \in B^s\} = \{|L^s_v(\omega(n))| = \omega(n)\}$ is considered, we may safely assume that $|D^s_{i-1}|, |H_i(g)| \le
M(\omega(n) - 1)$. It follows from these inequalities and (8) that for large $n$ we have

$\frac{|S'^s(u)| s_g}{m} (1 - \varepsilon) \le p_i(g) \le \frac{|S'^s(u)| s_g}{m} (1 + \varepsilon).$  (45)

In addition, in view of (11), we can replace $m$ by $\beta n$ in the denominator. Let $n^*_t$ denote the
number of uncoloured vertices of type $t$ at the moment when $u = u_i$ starts search of its simple
children. As long as the exploration is not interrupted we have $n^*_t \ge n_t - 3\omega(n)$. For large $n$ this
inequality implies $n^*_t \ge (1 - \varepsilon/2) n_t$. Invoking (10) we obtain

$q_t n (1 - \varepsilon) \le n^*_t \le q_t n (1 + \varepsilon), \qquad t \in [M].$  (46)

It follows from (45), (46) that we can couple $Y_t(u)$ with binomial random variables $Y^-_t(u)$ and $Y^+_t(u)$,
distributed as the numbers of children of type $t$ of the corresponding particles of $\mathcal{Y}^{-\varepsilon}$ and $\mathcal{Y}^{+\varepsilon}$,
so that almost surely we have $Y^-_t(u) \le Y_t(u) \le Y^+_t(u)$. These inequalities imply (44).
Proof of (41). We write $P(v \in A) \le \sum_{t \in [M]} (P(K_t \ge \omega(n)) + P(I_t \ge \omega(n)))$ and show that

$P(K_t \ge \omega(n)) = O(n^{-1}), \qquad P(I_t \ge \omega(n)) = O(n^{-1}).$  (47)

We prove the first bound only. The proof of the second bound is much the same. Given
$i \le \omega(n)$, the number of complex children of type $t$ discovered by $u_i \in L^s_v$ is the sum of at most
$n_t$ independent Bernoulli random variables each with success probability at most

$p_* = p_1(M, M, M\omega(n), m - M\omega(n)) \le c M^4 \omega(n) m^{-2},$

see (9). Therefore, $K_t$ is at most the sum of $n_t \omega(n)$ independent Bernoulli random variables with
success probability $p_*$. In particular, we have

$P(K_t \ge \omega(n)) \le P(\xi \ge \omega(n)),$  (48)

where $\xi \sim Bi(n_t \omega(n), p_*)$. By Chebyshev's inequality

$P(\xi \ge \omega(n)) \le (\omega(n) - E\xi)^{-2} \, Var\, \xi = O(n^{-1}).$  (49)

In the last step we invoke the simple bounds

$Var\, \xi \le E\xi = n_t \omega(n) p_* = O(\omega^2(n) n^{-1}) = o(\omega(n)).$

(48) and (49) imply the first bound of (47).

Proof of (24). It suffices to establish (24) for one particular function $\omega$, because for any other
$\tilde{B}^s$ defined by another such function $\tilde{\omega}$, we have

$|B^s| - |\tilde{B}^s| = o_P(n).$  (50)

To see this write $\bigl| |B^s| - |\tilde{B}^s| \bigr| \le |B^s \cup \tilde{B}^s| - |B^s \cap \tilde{B}^s|$ and observe that $B^s \cup \tilde{B}^s$ and $B^s \cap \tilde{B}^s$
represent sets of bs-vertices defined by the functions $\omega_1 = \min\{\omega, \tilde{\omega}\}$ and $\omega_2 = \max\{\omega, \tilde{\omega}\}$
respectively. An application of (25) to $\omega_1$ and $\omega_2$ yields the bound $E(|B^s \cup \tilde{B}^s| - |B^s \cap \tilde{B}^s|) = o(n)$. This
bound implies (50).

We show (24) for $\omega(n) = \lfloor \ln n \rfloor$. For this purpose we prove the bound for the variance

$E|B^s|^2 - (E|B^s|)^2 = o(n^2),$  (51)

which tells us that $|B^s| - E|B^s| = o_P(n)$. In particular, (51) combined with (25) shows (24).
In the proof of (51) we use the observation that the first $\omega(n)$ steps of any two explorations
starting at distinct vertices are almost independent. More precisely, we show below that uniformly
in $\{u, v\} \subset V$

$P(u, v \in B^s) = \rho_{Q,\beta}(s_u + 1) \rho_{Q,\beta}(s_v + 1) + o(1).$  (52)

It follows from (52) that

$2 \sum_{\{u,v\} \subset V} P(u, v \in B^s) = \sum_{u, v \in V} \rho_{Q,\beta}(s_u + 1) \rho_{Q,\beta}(s_v + 1) + o(n^2) = n^2 \rho^2 + o(n^2).$  (53)

In the last step we use (10). Observe that the left-hand sum of (53) is the expected value of
$2 \sum_{\{u,v\} \subset V} \mathbb{I}_{\{u, v \in B^s\}} = |B^s|^2 - |B^s|$. Therefore, from (53) we obtain

$E|B^s|^2 = n^2 \rho^2 + E|B^s| + o(n^2).$

This identity combined with (25) implies (51).

Let us prove (52). We first explore $u$ and then $v$. In each case we stop simple exploration after
the number of vertices in the corresponding list reaches $\omega(n)$. Note that with a high probability
these two explorations do not meet. Indeed, let $T_u$ ($T_v$) denote the set of vertices coloured by
the first (second) exploration and let $\mathcal{H}$ denote the event that the second exploration does not
encounter any vertex from $T_u$, i.e., $\mathcal{H} = \{D_u \cap S(v') = \emptyset, \text{ for each } v' \in T_v\}$. Here we denote
$D_u = \cup_{u' \in T_u} S(u')$ and $D_v = \cup_{v' \in T_v} S(v')$. Now assume that $u, v$ are unlabeled vertices, i.e.,
$u, v \notin A$. Then

$|T_u|, |T_v| \le (2M + 1)\omega(n) =: T,$

and $|D_u|, |D_v| \le MT \le M(2M + 1)\omega(n) =: D$. In this case, for each $v' \in T_v$, the probability
that $S(v')$ does not hit $D_u$ is at least $\bigl(1 - \frac{D}{m - D}\bigr)^{M}$. Here we use the fact that $S(v')$ has at most
$M$ elements (trials) to hit the set $D_u$ which occupies $|D_u| \le D$ attributes among those
(at least $m - D$) which have not been used by the current collection of vertices of the evolving list
$L^s_v$. Since there are at most $T$ vertices in $T_v$, we obtain

$P(\mathcal{H} \mid u, v \notin A) \ge \bigl(1 - \frac{D}{m - D}\bigr)^{MT} = 1 - O(\omega^2(n) n^{-1}).$

For arbitrary $u, v$ we obtain from (41)

$P(\mathcal{H}) \ge P(\mathcal{H} \cap \{u, v \notin A\}) = P(\mathcal{H} \mid u, v \notin A) P(u, v \notin A) = 1 - o(1).$  (54)

Now assume that $\rho_{Q,\beta}(s_u + 1) > 0$ (otherwise (52) trivially follows from (43)) and write

$P(u, v \in B^s) = P(v \in B^s \mid u \in B^s) P(u \in B^s).$  (55)

We can replace $P(v \in B^s \mid u \in B^s)$ by $p_{v,u} := P(v \in B^s \mid \{u \in B^s\} \cap \{u, v \notin A\} \cap \mathcal{H})$ and
$P(u \in B^s)$ by $\rho_{Q,\beta}(s_u + 1)$. It follows from (41), (54) and (43) that the error due to such
replacement is of order $o(1)$. From (55) we obtain

$P(u, v \in B^s) = p_{v,u} \rho_{Q,\beta}(s_u + 1) + o(1).$  (56)

Finally, (52) follows from (56) and the identity $p_{v,u} = \rho_{Q,\beta}(s_v + 1) + o(1)$, which is shown in
much the same way as (42) above. □

Proof of Lemma 1. Let $(x_1, \dots, x_k)$ be a random permutation of elements of the set $W'$. For
$A = \{x_1, \dots, x_a\}$ we have, by symmetry,

$p(a, b, k) \le \sum_{1 \le i \le a} P(x_i \in B) = a P(x_1 \in B),$  (57)

$p_1(a, b, k) = \sum_{1 \le i \le a} P(A \cap B = \{x_i\}) = a P(A \cap B = \{x_1\}),$  (58)

$p_2(a, b, k) \le \sum_{1 \le i < j \le a} P(x_i, x_j \in B) = 2^{-1} a (a - 1) P(x_1, x_2 \in B),$  (59)

$p(a, b, h, k) = \sum_{1 \le i \le a} P(A \cap B = \{x_i\}) P(H \cap A = \emptyset \mid A \cap B = \{x_i\}) = p_1(a, b, k) \bigl(1 - p(a - 1, h, k - b)\bigr).$  (60)

The right-hand side inequality of (6) follows from (57) and the identity $P(x_1 \in B) = b/k$. The
left-hand side inequality follows from (58) combined with the identity $P(A \cap B = \{x_1\}) = \frac{b}{k} \frac{(k - b)_{a-1}}{(k - 1)_{a-1}}$
and inequalities

$1 \ge \frac{(k - b)_{a-1}}{(k - 1)_{a-1}} \ge \Bigl(\frac{k - a - b}{k - a}\Bigr)^{a-1} \ge 1 - \frac{ab}{k - a}.$

(7) follows from (59) and the identity $P(x_1, x_2 \in B) = \frac{b(b - 1)}{k(k - 1)}$. (8) follows from (60) combined with
(6). (9) follows from the identity $p_1(a, b, h, k) = p_1(a, b, k) \, p(a - 1, h, k - b)$, which is shown in
the same way as (60). □